Revert VLM support in parse_response #5561
Conversation
I'm not sure about this PR, as it seems to overlap quite a bit with what I'm currently addressing in #5489. The approach I'm taking there is intentionally incremental:
- First, ensure that we can consistently pass only tokenizer instances to `parse_response` (by introducing `self._tokenizer` across trainers).
- Then, in a follow-up step, simplify `parse_response` to only accept tokenizers: Make parse_response accept only tokenizer
Given that, this PR feels somewhat like duplicated effort. Would it make sense to wait for #5489 to land instead?
For context, I’m already working through the relevant discussion here: #5489 (comment)
`parse_response` only needs a tokenizer instance, but it had to handle both because we did not have a simple way to pass only the tokenizer. Once we implement `self._tokenizer` in all trainers, `parse_response` could be simplified to accept only tokenizer instances.
and here: #5489 (comment)
More broadly, the underlying goal of this PR is to centralize the processor/tokenizer disambiguation within `processing_class` in a single place, so that the rest of the code can rely on a well-defined and consistent interface, with a clear expected class instance.
In that sense, the current change in calling `parse_response` is an intermediate step toward that simplification, rather than a deviation from it.
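The centralized disambiguation described above can be sketched roughly as follows. This is an illustrative sketch only, not TRL's actual implementation: the helper name `resolve_tokenizer` and the stand-in classes are hypothetical, standing in for `ProcessorMixin` and `PreTrainedTokenizerBase`.

```python
# Hypothetical sketch: resolve the tokenizer from processing_class ONCE,
# in the trainer's setup, instead of re-disambiguating inside parse_response.

class FakeTokenizer:
    """Stand-in for a PreTrainedTokenizerBase instance."""

class FakeProcessor:
    """Stand-in for a VLM ProcessorMixin, which wraps an inner tokenizer."""
    def __init__(self):
        self.tokenizer = FakeTokenizer()

def resolve_tokenizer(processing_class):
    # A VLM processor exposes its inner tokenizer as `.tokenizer`;
    # a plain tokenizer is returned unchanged.
    return getattr(processing_class, "tokenizer", processing_class)

# A trainer would then store this once, e.g.:
#   self._tokenizer = resolve_tokenizer(self.processing_class)
# and downstream code (like parse_response) only ever sees a tokenizer.
tok = resolve_tokenizer(FakeProcessor())
print(type(tok).__name__)  # -> FakeTokenizer
```

With this pattern, `parse_response` no longer needs to know that processors exist at all.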
Ah yes, ok. LGTM @albertvillanova
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 973cf25.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fffa7a79d3

`parse_response` previously accepted either a tokenizer or a processor (from #5323) and unwrapped the inner tokenizer on the fly. Now that call sites can easily pass the tokenizer directly, we move that disambiguation to the call sites and keep `parse_response` strictly tokenizer-only. This centralizes the "processor vs tokenizer" logic in one place per trainer and makes `parse_response`'s contract simpler.

Note

Medium Risk

This is a small but potentially breaking API change: callers that previously passed a VLM `ProcessorMixin` to `parse_response` must now pass `processor.tokenizer`, which could affect downstream integrations.

Overview

Simplifies `parse_response`'s contract to accept only a `PreTrainedTokenizerBase` (removing the internal "processor vs tokenizer" unwrapping logic) and updates its docstring accordingly.

Adjusts `TestParseResponse` to explicitly pass `processing_class.tokenizer` for VLM processors, keeping parsing behavior the same while moving the disambiguation to call sites.

Reviewed by Cursor Bugbot for commit fb245d4.
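The migration for affected callers can be illustrated with a minimal sketch. The `parse_response` body and the dummy classes here are hypothetical stand-ins (the real function's signature and behavior live in TRL); the point is only the call-site change: unwrap the tokenizer yourself instead of passing the processor.

```python
# Illustrative sketch of the breaking change; names and bodies are stand-ins.

def parse_response(response_ids, tokenizer):
    # New contract: strictly tokenizer-only -- no ProcessorMixin unwrapping here.
    return tokenizer.decode(response_ids)

class DummyTokenizer:
    """Stand-in tokenizer with a decode method."""
    def decode(self, ids):
        return " ".join(str(i) for i in ids)

class DummyProcessor:
    """Stand-in VLM processor wrapping a tokenizer."""
    def __init__(self):
        self.tokenizer = DummyTokenizer()

processor = DummyProcessor()
# Before: parse_response(ids, processor) unwrapped the tokenizer internally.
# After: the caller disambiguates and passes the inner tokenizer explicitly.
print(parse_response([1, 2, 3], processor.tokenizer))  # -> 1 2 3
```

Callers that already pass a plain tokenizer need no change; only VLM call sites that passed a processor must switch to `processor.tokenizer`.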